TwIdw—A Novel Method for Feature Extraction from Unstructured Texts

Abstract

This research proposes a novel technique for fake news classification using natural language processing (NLP) methods. The proposed technique, TwIdw (Term weight–inverse document weight), is used for feature extraction and is based on TfIdf, with the term frequencies replaced by the depth of the words in documents. Its effectiveness is compared to another feature extraction method, basic TfIdf. Classification models were created using random forest and feedforward neural networks, and within those, three different datasets were used. The feedforward neural network method on the KaiDMML dataset showed an increase in accuracy of up to 3.9%. The random forest method was not as successful, showing only a small increase (1%). The neural network, on the other hand, showed an improvement on all datasets. Precision and recall measures also confirmed good results, particularly for the neural network method. TwIdw has the potential to be used in various NLP applications, including classification problems.
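
The idea of replacing term frequency with a depth-based term weight can be illustrated with a short Python sketch. The abstract does not define how depth is measured, so the weighting below (each occurrence at token index i in a document of n tokens contributes 1 - i/n, so earlier words weigh more) is purely an assumption for illustration, as are the function names `depth_weights` and `twidw_sketch`. The inverse-document part uses the familiar log(N/df) form from TfIdf. This is not the authors' implementation.

```python
import math
from collections import defaultdict


def depth_weights(tokens):
    """Hypothetical depth-based term weight (an assumption, not the paper's formula).

    Each occurrence at index i of n tokens adds 1 - i/n to the term's weight,
    so words appearing earlier in the document count for more.
    """
    n = len(tokens)
    weights = defaultdict(float)
    for i, tok in enumerate(tokens):
        weights[tok] += 1.0 - i / n
    return dict(weights)


def twidw_sketch(docs):
    """Return one {term: weight} dict per document.

    weight = assumed depth-based term weight * log(N / document frequency)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = defaultdict(int)
    for tokens in tokenized:
        for term in set(tokens):
            df[term] += 1

    vectors = []
    for tokens in tokenized:
        tw = depth_weights(tokens)
        vectors.append({t: w * math.log(n_docs / df[t]) for t, w in tw.items()})
    return vectors


if __name__ == "__main__":
    docs = ["fake news spreads fast", "real news travels slowly"]
    for vec in twidw_sketch(docs):
        print(vec)
```

In this sketch a term that occurs in every document gets weight zero, as in plain TfIdf, while among the remaining terms those nearer the start of a document score higher. The resulting dictionaries could then be converted to fixed-length vectors and passed to a classifier such as a random forest or a feedforward neural network.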

Similar resources

Automatic Ontology Extraction from Unstructured Texts

Construction of the ontology of a specific domain currently relies on the intuition of a knowledge engineer, and the typical output is a thesaurus of terms, each of which is expected to denote a concept. Ontological ‘engineers’ tend to hand-craft these thesauri on an ad-hoc basis and on a relatively small scale. Workers in the specific domain create their own special language, and one device for...

Multiresolution Feature Extraction from Unstructured Meshes

We present a framework to extract mesh features from unstructured two-manifold surfaces. Our method computes a collection of piecewise linear curves describing the salient features of surfaces, such as edges and ridge lines. We extend these basic techniques to a multiresolution setting which improves the quality of the results and accelerates the extraction process. The framework is semiautomat...

Authorship identification from unstructured texts

Authorship identification is a task of identifying authors of anonymous texts given examples of the writing of authors. The increasingly large volumes of anonymous texts on the Internet enhance the great yet urgent necessity for authorship identification. It has been applied to more and more practical applications including literary works, intelligence, criminal law, civil law, and computer for...

Knowledge Extraction from Texts: a method for extracting predicate-argument structures from texts

1. Aims of the project. The general aim of our project is to improve the quality of existing systems extracting knowledge from texts by introducing refined lexical semantics data. The contribution of lexical semantics to knowledge extraction is not new and has already been demonstrated in a few systems. Our more precise aims are to: propose and show feasibility of more radical semanti...

A Novel Feature Extraction Method for Facial Expression Recognition

In this work, a novel facial feature extraction method is proposed for automatic facial expression recognition, which detects local texture information, global texture information and shape information of the face automatically to form the facial features. First, Active Appearance Model (AAM) is used to locate facial feature points automatically. Then, the local texture information in these ...

Journal

Journal title: Applied Sciences

Year: 2023

ISSN: 2076-3417

DOI: https://doi.org/10.3390/app13116438